Llumnix Code Walkthrough
AlibabaPAI/llumnix: Efficient and easy multi-instance LLM serving
Llumnix draws a clear line between its two levels: the Global Scheduler cannot directly control running requests.
Global level
- Global Scheduler
- Request dispatching
- Migration
- Instance auto-scaling control
Instance level
- llumlet
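The two-level split in the outline above can be sketched roughly as follows. This is a minimal toy model, not Llumnix's actual code: the class names `Llumlet` and `GlobalScheduler`, every method name, the load metric (running requests divided by capacity), and the scaling thresholds are all assumptions made for illustration. The point it tries to show is that the global level only sees per-instance load and picks instances, while the instance level decides which running request actually moves.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Llumlet:
    """Hypothetical instance-level component: it alone owns running requests."""
    instance_id: str
    capacity: int
    running: list = field(default_factory=list)

    def load(self) -> float:
        # Assumed load metric reported to the global level.
        return len(self.running) / self.capacity

    def add_request(self, request_id: str) -> None:
        self.running.append(request_id)

    def pick_migration_candidate(self) -> Optional[str]:
        # The instance, not the global scheduler, chooses which request moves.
        return self.running.pop() if self.running else None


class GlobalScheduler:
    """Hypothetical global-level component: sees only per-instance load."""

    def __init__(self, llumlets: list) -> None:
        self.llumlets = llumlets

    def dispatch(self, request_id: str) -> str:
        # Request dispatching: send a new request to the least-loaded instance.
        target = min(self.llumlets, key=lambda l: l.load())
        target.add_request(request_id)
        return target.instance_id

    def rebalance(self) -> Optional[tuple]:
        # Migration: choose a (source, destination) instance pair only.
        src = max(self.llumlets, key=lambda l: l.load())
        dst = min(self.llumlets, key=lambda l: l.load())
        if src.load() <= dst.load():
            return None  # already balanced, or only one instance
        request_id = src.pick_migration_candidate()
        if request_id is None:
            return None
        dst.add_request(request_id)
        return (src.instance_id, dst.instance_id)

    def scaling_decision(self, high: float = 0.8, low: float = 0.2) -> str:
        # Auto-scaling: thresholds are made-up values for this sketch.
        avg = sum(l.load() for l in self.llumlets) / len(self.llumlets)
        if avg > high:
            return "scale_up"
        if avg < low:
            return "scale_down"
        return "hold"
```

Note that `GlobalScheduler` never reads or modifies `running` directly; it only calls `load()`, `add_request()`, and `pick_migration_candidate()`, which mirrors the constraint that the global level cannot directly control running requests.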